AITopics | covariate matrix

Can we globally optimize cross validation loss in ridge regression

Neural Information Processing SystemsApr-27-2026, 03:01:02 GMT

Models like LASSO and ridge regression are extensively used in practice due to their interpretability, ease of use, and strong theoretical guarantees. Crossvalidation (CV) is widely used for hyperparameter tuning in these models, but do practical optimization methods minimize the true out-of-sample loss? A recent line of research promises to show that the optimum of the CV loss matches the optimum of the out-of-sample loss (possibly after simple corrections). It remains to show how tractable it is to minimize the CV loss. In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima. We can guarantee that the CV loss is quasiconvex in at least one case: when the spectrum of the covariate matrix is nearly flat and the noise in the observed responses is not too high. More generally, we show that quasiconvexity status is independent of many properties of the observed data (response norm, covariate-matrix right singular vectors, and singular-value scaling) and has a complex dependence on the few that remain. We empirically confirm our theory using simulated experiments.

artificial intelligence, machine learning, optimization problem, (19 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.82)

Add feedback

On Robustness of Principal Component Regression

Anish Agarwal, Devavrat Shah, Dennis Shen, Dogyoon Song

Neural Information Processing SystemsFeb-12-2026, 23:01:56 GMT

Neural Information Processing Systems http://nips.cc/

covariate, matrix, pcr, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada (0.04)
Europe > Spain > Basque Country (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Add feedback

cc298d5bc587e1b650f80e10449ee9d5-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 05:07:42 GMT

assumption, matrix, quasiconvexity, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Data Science (0.68)

Add feedback

On Robustness of Principal Component Regression

Anish Agarwal, Devavrat Shah, Dennis Shen, Dogyoon Song

Neural Information Processing SystemsOct-3-2025, 05:48:01 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, covariate, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Robust Logistic Regression and Classification

Neural Information Processing SystemsSep-30-2025, 09:12:11 GMT

We consider logistic regression with arbitrary outliers in the covariate matrix. We propose a new robust logistic regression algorithm, called RoLR, that estimates the parameter through a simple linear programming procedure. We prove that RoLR is robust to a constant fraction of adversarial outliers. To the best of our knowledge, this is the first result on estimating logistic regression model when the covariate matrix is corrupted with any performance guarantees. Besides regression, we apply RoLR to solving binary classification problems where a fraction of training samples are corrupted.

electronic proceedings, name change, robust logistic regression and classification, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Robust Logistic Regression and Classification

Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

Neural Information Processing SystemsFeb-9-2025, 06:57:30 GMT

We consider logistic regression with arbitrary outliers in the covariate matrix. We propose a new robust logistic regression algorithm, called RoLR, that estimates the parameter through a simple linear programming procedure. We prove that RoLR is robust to a constant fraction of adversarial outliers. To the best of our knowledge, this is the first result on estimating logistic regression model when the covariate matrix is corrupted with any performance guarantees. Besides regression, we apply RoLR to solving binary classification problems where a fraction of training samples are corrupted.

artificial intelligence, machine learning, regression, (18 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Sampling-guided Heterogeneous Graph Neural Network with Temporal Smoothing for Scalable Longitudinal Data Imputation

Zhang, Zhaoyang, Chen, Ziqi, Liu, Qiao, Xie, Jinhan, Zhu, Hongtu

arXiv.org Artificial IntelligenceNov-7-2024

In this paper, we propose a novel framework, the Sampling-guided Heterogeneous Graph Neural Network (SHT-GNN), to effectively tackle the challenge of missing data imputation in longitudinal studies. Unlike traditional methods, which often require extensive preprocessing to handle irregular or inconsistent missing data, our approach accommodates arbitrary missing data patterns while maintaining computational efficiency. SHT-GNN models both observations and covariates as distinct node types, connecting observation nodes at successive time points through subject-specific longitudinal subnetworks, while covariate-observation interactions are represented by attributed edges within bipartite graphs. By leveraging subject-wise mini-batch sampling and a multi-layer temporal smoothing mechanism, SHT-GNN efficiently scales to large datasets, while effectively learning node representations and imputing missing data. Extensive experiments on both synthetic and real-world datasets, including the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, demonstrate that SHT-GNN significantly outperforms existing imputation methods, even with high missing data rates. The empirical results highlight SHT-GNN's robust imputation capabilities and superior performance, particularly in the context of complex, large-scale longitudinal data.

dataset, longitudinal data, longitudinal subnetwork, (14 more...)

arXiv.org Artificial Intelligence

2411.04899

Country:

North America > United States > North Carolina (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.69)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Robust Logistic Regression and Classification

Neural Information Processing SystemsMar-13-2024, 09:30:37 GMT

We consider logistic regression with arbitrary outliers in the covariate matrix. We propose a new robust logistic regression algorithm, called RoLR, that estimates the parameter through a simple linear programming procedure. We prove that RoLR is robust to a constant fraction of adversarial outliers. To the best of our knowledge, this is the first result on estimating logistic regression model when the covariate matrix is corrupted with any performance guarantees. Besides regression, we apply RoLR to solving binary classification problems where a fraction of training samples are corrupted.

outlier, regression, rolr, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

Stephenson, William T., Frangella, Zachary, Udell, Madeleine, Broderick, Tamara

arXiv.org Machine LearningJul-19-2021

Models like LASSO and ridge regression are extensively used in practice due to their interpretability, ease of use, and strong theoretical guarantees. Cross-validation (CV) is widely used for hyperparameter tuning in these models, but do practical optimization methods minimize the true out-of-sample loss? A recent line of research promises to show that the optimum of the CV loss matches the optimum of the out-of-sample loss (possibly after simple corrections). It remains to show how tractable it is to minimize the CV loss. In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima. We can guarantee that the CV loss is quasiconvex in at least one case: when the spectrum of the covariate matrix is nearly flat and the noise in the observed responses is not too high. More generally, we show that quasiconvexity status is independent of many properties of the observed data (response norm, covariate-matrix right singular vectors and singular-value scaling) and has a complex dependence on the few that remain. We empirically confirm our theory using simulated experiments.

assumption 3, matrix, quasiconvexity, (14 more...)

arXiv.org Machine Learning

2107.09194

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.61)

Add feedback

Covariate-assisted Sparse Tensor Completion

Ibriga, Hilda S, Sun, Will Wei

arXiv.org Machine LearningMar-10-2021

We aim to provably complete a sparse and highly-missing tensor in the presence of covariate information along tensor modes. Our motivation comes from online advertising where users click-through-rates (CTR) on ads over various devices form a CTR tensor that has about 96% missing entries and has many zeros on non-missing entries, which makes the standalone tensor completion method unsatisfactory. Beside the CTR tensor, additional ad features or user characteristics are often available. In this paper, we propose Covariate-assisted Sparse Tensor Completion (COSTCO) to incorporate covariate information for the recovery of the sparse tensor. The key idea is to jointly extract latent components from both the tensor and the covariate matrix to learn a synthetic representation. Theoretically, we derive the error bound for the recovered tensor components and explicitly quantify the improvements on both the reveal probability condition and the tensor recovery accuracy due to covariates. Finally, we apply COSTCO to an advertisement dataset consisting of a CTR tensor and ad covariate matrix, leading to 23% accuracy improvement over the baseline. An important by-product is that ad latent components from COSTCO reveal interesting ad clusters, which are useful for better ad targeting.

algorithm, matrix, tensor, (16 more...)

arXiv.org Machine Learning

2103.06428

Country: